Co-operation and Detachment in Discourse Understanding

Dan CRISTEA
Faculty of Computer Science
16, Berthelot St.
Romania
valyt@mail.dntis.ro
{catab, tensor, hoana, clau, cnita, danut, mary}@fenrir.infoiasi.ro
dcristea@infoiasi.ro

Abstract

The paper presents the architecture and behaviour of a system that integrates several ideas from artificial intelligence and natural language processing in order to build a semantic representation for discourse. It is shown how modules that can contribute different kinds of expertise (syntactic, semantic, common sense inference, discourse planning, anaphora resolution, cue-words and temporal) can be placed around a skeleton made up of a POS/morphological tagger and an incremental discourse parser. The performance of the system is affected by, but is not vitally dependent on, any of the contributing expert modules.

1 Introduction

In Cristea and Webber (1997) it is shown how a discourse can be represented by a family of discourse trees, each of them reflecting a specific interpretation. Building a tree is an incremental process based on two elementary operations inspired by L-TAGs (Joshi (1987), Schabes (1990)). Adjoining adds to the current discourse tree (CDT) an auxiliary tree with at least one non-empty node. Substitution unifies the root of a substitution structure with an empty node of a CDT. These operations can be performed only on the generalised right frontier of the CDT.

The Veins Theory (VT) (Cristea, Ide and Romary (1998b)) uses a representation of the discourse structure as a binary tree, following Marcu (1996), Cristea and Webber (1997), without loss of generality. Terminal nodes in the tree represent discourse units and non-terminal nodes represent discourse relations. Usually a unit is uniquely identified by a label. A polarity is established among the children of a relation (Mann and Thompson (1987)), which identifies at least one node, the nucleus, considered essential for the writer’s purpose; non-nuclear nodes, which include spans of text that increase understanding but are not essential to the writer’s purpose, are called satellites. Following also other modern approaches (Gardent (1997), Schilder (1997)), we believe that a discourse tree of this kind represents an interpretation of the message communicated by the discourse.

VT operates with two concepts: head and vein, which are expressions of terminal labels (discourse unit labels) attached to each node in the discourse structure. Head expressions are computed bottom-up:
1. The head of a terminal node is its label.
2. The head of a non-terminal node is the concatenation of the heads of its nuclear children.

Vein expressions are sub-sequences of the sequence of unit labels making up the discourse and are computed top-down:
1. The vein expression of the root is its head.
2. For each nuclear node whose parent node has vein v, the vein expression is:
• if the node has a left non-nuclear sibling with head h, then seq(mark(h), v), where mark(x) is a function that takes a string of symbols x and returns each symbol in x marked in some way (e.g., with parentheses) and seq(x, y) is a sequencing function that computes that permutation of x concatenated with y that is given by the left to right reading of the sequence of labels in x and y on the terminal frontier of the tree;
• otherwise, v.
3. For each non-nuclear node of head h whose parent node has vein v, the vein expression is:
• if the node is the left child of its parent, then seq(h, v);
• otherwise, seq(h, simpl(v)), where simpl(x) is a function that eliminates all marked symbols from its argument, if they exist.

The domain of accessibility of a unit is defined as the string of unit labels appearing in its vein expression and prefixing that unit label itself. VT makes two important claims. The first regards discourse cohesion: references from a given unit are possible only in its domain of accessibility. The second, which regards discourse coherence and extends the classical Centering Theory (CT) (Grosz, Joshi, and Weinstein (1995)) at global level, regards the inference load for processing global discourse. Accordingly a “smoothness” index can be used to compare different discourse structures and interpretations. CT defines a set of transition types for discourse (Grosz, Joshi, and Weinstein (1995); Brennan, Friedman and Pollard (1987)). If a smoothness score is assigned to each transition as follows:

Transition          Score
CONTINUATION        4
RETAINING           3
SMOOTH SHIFTING     2
ABRUPT SHIFTING     1
NO Cb               0

and the scores of all transitions in a segment are summed up and the result is divided by the number of transitions in the segment, an index of the overall coherence of the segment, called the global smoothness score, is obtained. VT claims that the global smoothness score of a discourse, when computed following the neighbouring metric given by vein expressions, is at least as high as the score computed following the adjacency metric recommended by CT. By this VT claims that long-distance transitions computed using vein expressions are systematically smoother than accidental transitions at segment boundaries. We use the global smoothness score as a criterion of acceptability of a discourse structure.

Using VT as a guiding theory in an incremental discourse parsing approach is based on the assumption that a vein expression represents a piece of discourse having its own meaning (an argumentation line, a logical chain) and that all the veins in the discourse tree concur at building the global meaning of the discourse.

The paper describes an approach aiming at incrementally deriving a discourse tree where, at each step, the current unit contributes a new piece of structure, as in Cristea and Webber (1997). The form and content of this piece and the place where it is to be included in the existing discourse tree (DT) are proposed and then constrained by a set of knowledge sources. We advocate a parsing process that applies all possible knowledge sources in parallel at each step. Knowledge contributed by each source is integrated through a parsing process that allows the maximum degree of liberty that still obeys all the restrictions evidenced.

2 Overview of the approach

The overall philosophy is that of a system made up of a collection of modules that co-operate on a client-server basis. Each module can contribute its specific expertise to a common enterprise. When a module needs information from another module, it (the client) sends a request and waits for an answer. If the interrogated expert cannot answer or is unavailable, the interrogating module can still proceed, but its results are less precise. While trying to reply, the server might itself need some external data, so, as a client, it launches another request, and so on. The way this system is built does not allow circular waiting lines.

There are two core components:
• a Part Of Speech Tagger (PosTag) - which splits the text in discourse units [1] and provides morphological markers for each word;
• an incremental Discourse Parser (DiscPar) - which builds the associated discourse tree(s).

A set of optional knowledge sources could be plugged into the system:
• a Cue-words Expert (CueExp) - provides hints with respect to unit delimitation, identification of rhetorical relations and their place of insertion based on a collection of cue-words heuristics;
• a Syntactic Expert (SynExp) - marks syntactic components;
• a Semantic Expert (SemExp) - builds the associated semantic net for each discourse unit, using a dictionary which contains morphological and semantic information. Each entry has an associated semantic-net pattern;
• a Semantic Inference Expert (InfExp) - makes common-sense inferences based on the nets built by SemExp using a set of common sense reasoning rules;
• a Reference Resolution Expert (RefExp) - solves pronominal and functional anaphora;
• a Planning Expert (PlanExp) - validates argumentation lines (comprehensible sequences of discourse units) with respect to the planning theory. It uses a database which contains for every action (verb) the description of specific planning information;
• a Temporal Expert (TempExp) - estimates the appropriate moment in time (logical time) for the actions that occur.

[1] Minimally a clause, maximally a dot-to-dot compound sentence.

The general architecture of the system, as well as the data flow between the modules, is presented in Figure 1.

[Figure 1: diagram. Input: DISCOURSE. Components: PosTag, DiscPar, CueExp, SynExp, SemExp, InfExp, RefExp, PlanExp, TempExp, DB. Outputs: TREES, REFERENCES, SEM NETS.]
Figure 1. The data flow of the system (grey arrows/boxes show compulsory data/components, while white ones show optional ones)

The process is fully incremental, modelling a human-like processing of discourse. Although the input is fetched sequentially, the overall processing supposes a certain degree of parallelism, as modules that co-operate can work concurrently, being synchronised by the request/answer messages exchanged among them.

3 How does the system work?

The plain text constitutes the input for PosTag, which processes it sentence by sentence. After tokenization and lexical look-up, PosTag annotates the text with specific lexical information. In order to split the text in discourse units, PosTag interrogates CueExp. An example of the heuristics used by PosTag for determining unit bounds is the following:

, when => ,|when

In order to validate a set of unit boundary heuristics we are currently doing statistics on a Romanian collection of texts. Lack of this information gives a coarse delimiting of the discourse into units, the worst possible being a "full stop" criterion.

We sketch the way the system works on a sample of Romanian text [2]. For the given sample, the existing heuristics indicate the following unit boundaries:

1 Piton has received secretly order from Hera
2 to (him) watch Apollo,
3 când (0) va trece prin munte,
  when he will cross (through) the mountain,
4 and to (him) take the life
5 Hera (him) hated son the new-born of Leta,
6 because husband her, all-mighty Zeus, cared more about him than about sons her: Hefaistos and Ares
(the underlined words are the cue-words that led to this delimitation)

[2] “Legendele Olimpului” by Al. Mitru

The result provided by PosTag is a database image of an SGML annotated text, containing morphological information, unit delimitation and noun-phrase tags. If a syntactic expert is available, besides morphological features, the description for a unit can also include some syntactic features.

DiscPar processes the discourse units in order to build the discourse tree(s) associated to the text. On these trees specific operations (adjoining and substitution) defined for discourse processing, as described in Cristea and Webber (1997), are used. DiscPar builds the current discourse tree (CDT) in an incremental manner. For each unit, a new structure (an elementary or an auxiliary tree) is inserted into the CDT. Elementary trees are substituted in a substitution place of the CDT, marked by a down arrow; auxiliary trees are adjoined, the foot node being marked by a star (see also Figure 2). A decision must be taken with respect to the most appropriate structure for representing the current unit and the right place for insertion. At each step, only the generalised right frontier is open for insertion (Cristea and Webber, 1997). Criteria and constraints on choosing the appropriate structures and places for insertion are requested by DiscPar from CueExp, RefExp, TempExp and PlanExp. If none of them is available or able to answer, DiscPar tries every possible structure inserted in every possible place, leading to an exponential number of trees (see Appendix).

If data are available from CueExp, the search is strongly restricted. CueExp provides heuristics that hint at the rhetorical relation, the form of the auxiliary tree and the right place for insertion. An example of the heuristics provided by CueExp and used by DiscPar is given in Figure 2.

Rule 1
IF    cue-word = când (when)
      beginning of phrase = NO
THEN  insertion = adjoining
      what:  structure = [tree: CIRC, foot node *, unit U i+1], probability = p1
      where: place = previous unit, probability = p2

Rule 2
IF    cue-word = când (when)
THEN  insertion = adjoining
      what:  structure = [three alternative trees built from ?, CIRC, *, and unit U i+1], probability = p3, p4, p5 respectively
      where: place = previous units, probability = p6

Figure 2. Heuristics provided by CueExp for the cue-word când (when)

The latter rule of Figure 2 actually represents a set of three rules with the corresponding structures, each of them having its probability. The form of the structure proposed in the second rule means that the unit to be added is a circumstance for a further unit. When this unit is processed it will take the place of the “down arrow” node (substitution). The unknown discourse relation (marked by “?”) could possibly be filled in when the substitution is performed. The probabilities are computed by doing statistics on a Romanian collection of texts.

Once a CDT is built, DiscPar computes all veins and launches queries to RefExp and PlanExp, waiting for scores. Depending on these scores, some of the CDTs are rejected, reducing the search space to fewer discourse structures. The final result is the CDT (or the set of CDTs) obtained after processing the last unit.

It is not unusual that the discourse parsing results in a set of trees, given that discourse can often be inherently ambiguous. Still it is anticipated that the set actually obtained is larger than the set of actual interpretations, in case the expert modules do not contribute sufficient constraints.

Every such tree represents a possible interpretation of the processed discourse. The semantics of the overall discourse could then result in a compositional way from the event description of terminal nodes built for every unit by SemExp and the discourse relations of the internal nodes of the tree. We do not deal with this aspect in the present paper.

When requested by DiscPar, RefExp performs an anaphora resolution process. The initial supposition is that each reference string (in our approach, noun-phrases) realises a center. The resolution process consists in determining the antecedents of the reference strings in the text or identifying the functional relations between them. Two reference strings co-refer if they realise the same center. Two reference strings are in a functional relation if one of the corresponding centers is a role of the other. During the resolution process, to every reference string is associated a list of centers it may possibly realise. This set is diminished when some of its elements are eliminated by applying multiple filters. The filters are ordered with respect to the confidence in the results provided by each of them. The most salient filter will be applied first. The most important filter is the morphological one, which selects from the co-referents list only those that agree (in number, gender and, if necessary, in person) with the analysed entity. Secondly, a semantic filter is applied on the updated list of co-referents by delivering it to InfExp, which performs an inferential semantic process in order to eliminate the centers that cannot be realised by the reference string. It has to be mentioned that InfExp cannot always give a definite answer; if there is even a small chance that a center in the list could be realised by the analysed reference string, this center has to be conserved in the list of possible centers. Beside the co-references, InfExp has another very important role, that of providing the functional relations between centers directly realised by noun-phrases; this is the only way the bridge anaphora could be identified. If there are still ambiguities (i.e., the antecedents list has more than one element), a filter based on VT is applied. This uses the veins provided by DiscPar in the query and consists of two levels. The first one eliminates from the center list those which are not realised by reference strings in the accessibility domain of the current unit. The second one is a reference resolution process based on VT, which can find the backward-looking center for the unit. If, at the end of the filtering process, there are still multiple centers for a reference string, several kinds of constraints (syntactic, language-specific features), and, finally, even the recency principle could be applied in order to force a unique center. The supposition made is that a noun phrase may either realise a new center, or refer to an already created one. This is why, after the process is over, there are no unsolved references in the processed part of discourse (a noun phrase without a referent is considered a newly introduced discourse entity). The number of new concepts can be used to evaluate the success of the reference resolution process. In the process of deciding among a set of possible interpretations we use the heuristic that prefers the one with fewer new centers.

A smoothness-score is calculated by taking into account the four kinds of transitions mentioned by CT (continuing, retaining, smooth-shift, abrupt-shift), computed using the unit order defined by the veins. Conforming to the conjectures made by VT, a higher score indicates a better fluency of the discourse. The answer given to the received query from DiscPar is a total score reflecting both the success of the reference resolution and the discourse smoothness.

Let us sketch the reference process for the clitic -l (him) in the second unit. If SynExp is present, it identifies -l, which in Romanian is a direct object, the second direct object in the clause, as co-referring with Apollo. If SynExp is not available RefExp proceeds with heuristics. For instance, the morphological features of the clitic indicate that it is a personal pronoun, masculine, singular, 3rd person, so its possible co-referents on the accessibility domain (the sequence [...]) are [Piton] and [Apollo]. If InfExp exists, RefExp asks it for a supposition on this matter. Given that the subject of the clause has already been determined (indicated in the Romanian text by a zero pronoun) as being [Piton], and the agent of the event to watch must be different from the beneficiary, InfExp finds [Apollo] to be the most preferred co-referent for -l (the rule used by InfExp in order to give such a result will be described later).

If present, SemExp processes the input concurrently with DiscPar, it being required by InfExp and PlanExp. Below is a description of the functioning of SemExp. The meanings of units are represented using semantic nets having two levels: a conceptual one and a referential one. The semantic net contains concepts and their instances (nodes) and relations (links) between nodes.

The conceptual level of a net is incrementally built from the semantic patterns that are retrieved from an internal knowledge database. By “pattern” we mean a small semantic net, corresponding to a word, that contains a node with an associated feature structure (for a noun), or a central node having a set of role relations with empty targets (for a verb). There are two types of concepts: static ones corresponding to abstractions of entities from the represented world, and event ones for abstractions of actions (verbs). Static concepts can refer to a noun (most usually), an adjective, an adverb etc. The associated feature structure contains lexical information and various semantic properties (form, colour, size…). Between static nodes there can exist relations such as: PART-OF, HAS-AS-PART, ELEMENT-OF-SET, SUBSET-OF. Event concepts are composed of a central node, a number of satellite nodes that represent associated entities for the action, and a number of role relations. Each satellite plays a role that determines the name of the relation that links it to the central node. For a given event node (corresponding to the sense of a verb, usually) there is a small set of possible roles such as: AGENT, RECEIVER, OBJECT, INSTRUMENT, GOAL, TIME, PERIODICITY, PLACE, DIRECTION.

[Figure 3: nodes ANIMATE, MONSTER, Piton]
Figure 3. Semantic net for [Piton]

The other layer of the semantic net is the referential one, which contains instances of the nodes in the conceptual level (see Figure 3).

SemExp takes one word at a time from a unit and looks for its sense in the dictionary. It runs a context-based disambiguation procedure like in Ide (1997), which chooses the best scored sense for each word. Once the sense is determined, the associated semantic-net pattern is retrieved from the dictionary and added to the conceptual level of the semantic net for the current unit. By “added” we mean a process performing pattern matching and unification, trying to fill the empty fields of the feature structures of the nodes (concepts and their instances). The verb is the kernel of the semantic net built during this process. In the end some of the fields of the feature structures are filled while some may remain empty. Roles of verbs can be satisfied by static nodes, event nodes (semantic nets themselves), or can remain unsatisfied. It is also possible for a role to be satisfied by more than one entity.

[Figure 4: net with nodes GENERIC-EVENT, RECEIVE, MONSTER, THING, ev1:v1, Piton; relations MODE(0,n), FROM(1,n), REC(1,n), TO, OB(1,n); feature structures [syn=[cat=pp], sem=MODE], [syn=[cat=pp, head=[phon="from"]], sem=ANIMATE], [syn=[cat=n], sem=ANIMATE]]
Figure 4. Semantic net for the verb primise (has received), with components added from current discourse unit (the first unit in the sample text)

The semantic nets so built are then transmitted to InfExp. This module assumes that these structures are known facts in the considered world. In order to provide logical conclusions, it uses specific means of inferring on semantic nets, as in Thomason and Touretzcky (1991), and production rules that are associated to verbs (action patterns). These rules are retrieved from a database containing the common sense reasoning rules, according to the events that appeared in the text. The following reasoning rules will be used by InfExp on our example:

[Figure 5: two if/then rules; only the fragment "... ev2 & AGENT(ev2)=Y and X, Y ..." of the second rule is recoverable]
Figure 5. Examples of rules used by InfExp while processing our sample text

The reasoning process is activated by a request from RefExp. The result may be a confirmation for a proposal of center unification, or an answer to a query, such as “Who is X?” In the previous example where a referent for the clitic -l was looked for, InfExp finds [Apollo] in virtue of the second application and the constraint that the agent (X) of the watching event (ev1) must be different from the watched agent (Y of ev2).

Besides RefExp, another “expert” which imposes restrictions over the discourse structures proposed by DiscPar is the Planning Expert. PlanExp provides the level of trust given by planning theory regarding the place of insertion chosen for the unit currently processed. We follow, in general, the vision on planning described in Young, Pollack and Moore (1994). Plan building is usually a left-to-right process that tries to match the current unit action with those in the previous units. Still, in our approach, the sequence tried is that of the argumentation lines given by VT and not that of the surface string of units. When PlanExp receives from DiscPar a query about a vein plausibility, such as "How plausible is vein [...]?", it starts processing the vein expression unit by unit. For each unit PlanExp takes from SemExp the associated semantic net and extracts from a database the “planning-frame” corresponding to the verb in the unit. The action gets divided into a sequence of actions. A list is created containing that sequence and a set of expectations induced by the current action. After building this list for the current unit, PlanExp determines the intersection between this list and the ones corresponding to previously processed units. Depending on the results obtained, it gives a plausibility score for the vein. The vein plausibility score is computed as a sum of elementary matching scores that PlanExp gives (on a scale from 3 to 0) depending on whether the action in the current unit matches with: a sub-action of an action already in the plan, a supra-action of an action already in the plan, one of its pre/post-conditions matches with one of the pre/post-conditions of an action already in the plan, or there is no matching at all. The obtained score is then normalised and sent to DiscPar as the answer to its query.

For our example, when unit 5 is processed, the two possible veins corresponding to different places of insertion are: [...] and [...] (as shown in Figure 6).

[Figure 6: candidate trees over units 2, 3 and 4 with relations ELAB and CIRC, each annotated "Resulting vein: [...]"]
Figure 6. The possible veins corresponding to unit 5 for the three insertion places

The score of the vein [...] is null because no matching occurs between the action from unit 1 and the one in unit 5. On the contrary, the score of the vein [...] is positive because there is a match (see Figure 7).

unit 4
ACTION: to kill
ACTORS: killer, victim
RESTRICTIONS: being(killer), being(victim)
PRECONDITIONS: alive(victim)
EFFECTS: dead(victim)
DECOMPOSITION: -
EXPECTATIONS: to die

unit 5
ACTION: to hate
ACTORS: hater, hated
RESTRICTIONS: person(hater), person(hated)
PRECONDITIONS: -
EFFECTS: -
DECOMPOSITION: -
EXPECTATIONS: to harm, to kill, to punish

Figure 7. An example of matching taken into account by PlanExp

TempExp uses the morphological tense of the verb and heuristics that link its semantics with either a moment or an interval to make time-related predictions. For instance in the sequence [...], TempExp should report that the logical time of event 2 (a watching event in present tense, interval) follows the logical time of event 1 (a receiving event in past tense, moment), and the logical time of event 4 (a killing event in present tense, moment) follows the logical time interval of event 2. Then event 5, a hating event expressed with a past continuous tense (Romanian imperfect), precedes the logical time of event 4. These data, which could be of importance for the final semantic construction, are deposited into the database.

This performance is dictated by two kinds of rules: temporal order rules (TO rules) that decide on the order of logical time (moments, intervals) of the events and temporal constraining rules (TC rules) that evaluate the plausibility of trees based on tense considerations. The following rule is one of the first category while the two rules in Figure 9 are of the second one.

Rule TO1:
if    cue word = [Romanian form lost] (to)
      relation = ANY(ev1, ev2)
      time(ev1) = (tb1, te1)
      time(ev2) = (tb2, te2)
then  tb1 [operator illegible] tb2

Figure 8. A temporal order rule

Rule TC1:
if    cue word = cînd (when)
      relation = CIRCUMSTANCE(ev1, ev2)
      tense(ev1) = past-tense
then  tense(ev2) = past-tense, probability = 1

Rule TC2:
if    cue word = cînd (when)
      relation = CIRCUMSTANCE(ev1, ev2)
      tense(ev1) = present
then  tense(ev2) = present, probability = p1
      tense(ev2) = past-tense, probability = p2
      tense(ev2) = future, probability = p3

Figure 9. Two temporal constraining rules

In our example, at the moment the third unit is to be adjoined in the existing structure, after CueExp has given two places of adjoining, with different probabilities (node u2, and node ELAB, see the figure in the appendix), DiscPar also interrogates TempExp if this is available. Because the tense of unit 1 is past-tense, and that of unit 3 is future, Rule TC1 will invalidate adjoining of unit 3 in the node ELAB (as this would yield a relation CIRCUMSTANCE(ev1, ev3)). Rule TC2 validates the adjoining in unit 2 with the probability p3.

4 Conclusions and further work

Kurohashi and Nagao (1997) propose an automatic method for detecting the discourse parse tree based only on clue information. They claim that semantic knowledge is not a reliable source due to the problems of manual and automatic coding of information. While we agree that building an exhaustive semantic knowledge system for a real case application designed to do discourse parsing is a difficult task, still we believe that a system prepared to accept such kind of knowledge the moment it is available and which can live without it when it is not available would be the happiest combination. The architecture that we propose, although not extensively tested, must be able to get the same accuracy when fed only with surface information as Kurohashi and Nagao's system, but better and better results the more knowledge sources are available and the more reliable and complete their knowledge is.

The paper proposes a model for a system aimed at doing discourse understanding that integrates a shallow parser and an incremental discourse parser with a couple of optional modules, each contributing specific expertise (cue-words, syntactic, semantic, temporal, inferential, referential, planning).

The approach advocates a pragmatic plug-in philosophy able to develop, without any changes at the architectural level, a more and more reliable system as these modules become available and their knowledge becomes richer.

The discourse parser does an incremental discourse tree building guided by the principles of the Veins Theory. The use of semantic nets provides logical support for the analysis, and based on a sufficient amount of common sense knowledge, it guides the processing activity on the right way. In developing common sense rules for the inference module and the planning module, work is needed for finding how this knowledge can be deduced in a (semi)automatic way from existing corpora.

The architecture proposed makes intensive use of knowledge acquired from corpus study. We review this briefly for each module: the tagger is mainly based on a model of language that is built within the MULTEXT project (Tufiş and Mason (1998)). The knowledge CueExp has is built on a corpus investigation that has looked for cue words and their correlation with discourse structure. Although we have not yet done this, we intend to use a corpus also to refine and enlarge the semantic knowledge of SemExp as well as the planning knowledge of PlanExp. In the case of TempExp we intend to organise a corpus investigation that would help in acquiring more accurate temporal knowledge. SemExp and InfExp are two modules whose expertise is critically dependent on data obtained from corpora. The validation of Veins Theory itself, as a theory of global discourse, and consequently the expertise displayed by DiscPar and RefExp, is deeply dependent on a corpus study, as shown in Cristea, Ide and Romary (1998a) and Cristea, Ide and Romary (1998b).

The modules of the system are currently in different phases of development and testing on a Windows PC platform. They are written in Java and C++, except for PosTag and partly CueExp, which are implemented as CLIPS rules. The access to the database is provided by an SQL server. Modules communicate using TCP/IP, which allows them to run either on a single computer or, if available, on different machines across a net. At least theoretically, this design also permits different copies of modules that are extensively used to co-exist independently of one another, thus reducing the workload.

Acknowledgements

[...] the Romanian language model used in PosTag.

References

Brennan S.E., Walker Friedman M. and Pollard C.J. (1987) A centering approach to pronouns. In Proc. 25th Annual Meeting of ACL, Stanford (pp. 155-162).
Cristea D., Ide N. and Romary L. (1998a) Marking-up multiple views of a Text: Discourse and Reference. Proceedings of the First International Conference on Language Resources and Evaluation, Granada.
Cristea D., Ide N. and Romary L. (1998b) Veins Theory - A Model of Global Discourse Cohesion and Coherence. Proceedings of Coling/ACL’98, Montreal, forthcoming.
Cristea D. and Webber B.L. (1997) Expectations in Incremental Discourse Processing. Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics (pp. 88-95), Madrid.
Gardent C. (1997) Discourse TAG. Claus report nr. 89, University of the Saarland, Saarbruecken.
Grosz B.J., Joshi A.K., & Weinstein S. (1995) Centering: A Framework for Modeling the Local Coherence of Discourse, Computational Linguistics, 21(2), 203-225.
Ide N. (1997) Word Sense Disambiguation. Lecture at the Eurolan’97 Summer School on Corpus Linguistics, Băile Tuşnad.
Joshi A. (1987) An Introduction to Tree Adjoining Grammar. In Alexis Manaster-Ramer (ed.) Mathematics of Language.
Kurohashi S. and Nagao M. (1997) Automatic Detection of Discourse Structure by Checking Surface Information in Sentences. Proceedings of Coling'97, Kyoto, pp. 1123-1127.
Mann W.C. & Thompson S.A. (1987) Rhetorical Structure Theory: A Theory of Text Organisation, Text, 8(3), 243-281.
Marcu D. (1996) Building up Rhetorical Structure Trees, Proceedings of the 13th National Conference on AI (AAAI-96), 2, Portland, Oregon, pp. 1069-1074.
Schabes Y. (1990) Mathematical and Computational Aspects of Lexicalized Grammars. Technical Report MS-CIS-90-48, LINC LAB 179.
Schilder F. (1997) Tree Discourse Grammar, in Proceedings of the Amsterdam Colloquium and the International Workshop on Computational Semantics.
Thomason R., Touretzcky D. (1991) Inheritance Theory and Networks with Roles, in J.F. Sowa (ed.) Principles of Semantic Networks.
Tufiş D. and Mason O. (1998) Tagging Romanian Texts: a Case Study for QTAG, a Language Independent Probabilistic Tagger, Proceedings of the First International Conference on Language Resources and Evaluation, Granada.
Young R.M., Pollack M.E., Moore V. (1994) Decomposition and Causality in Partial Order Planning.

Appendix

An example of parsing performed by DiscPar

[Figure: the sequence of candidate discourse trees built incrementally over units u1 to u6, with relations ELAB, CIRC, JOIN and MOTIV and unknown relations marked "?"; steps are labelled with the modules involved (DiscPar, CueExp, TempExp). Legend: possible place for insertion as proposed by CueExp; node open only for substitution; nuclear node.]
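The head and vein rules stated in the Introduction can be illustrated with a short sketch. It is not the authors' DiscPar code: the `Node` class, the function names and the three-unit example tree are assumptions introduced here, marked symbols are represented as (label, True) pairs, and seq() is read as "merge and order by position on the terminal frontier".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Sym = Tuple[str, bool]  # (unit label, is_marked); hypothetical encoding of mark()


@dataclass(eq=False)
class Node:
    label: Optional[str] = None          # unit label, set for terminals only
    nuclear: bool = True
    children: List["Node"] = field(default_factory=list)
    head: List[str] = field(default_factory=list)
    vein: List[Sym] = field(default_factory=list)


def frontier(node: Node) -> List[str]:
    """Unit labels in left-to-right order on the terminal frontier."""
    if not node.children:
        return [node.label]
    return [lab for child in node.children for lab in frontier(child)]


def compute_heads(node: Node) -> List[str]:
    """Bottom-up: a terminal's head is its label; otherwise concatenate nuclear children."""
    if not node.children:
        node.head = [node.label]
        return node.head
    node.head = []
    for child in node.children:
        child_head = compute_heads(child)
        if child.nuclear:
            node.head.extend(child_head)
    return node.head


def seq(x: List[Sym], y: List[Sym], order: Dict[str, int]) -> List[Sym]:
    """Merge x and y and order the symbols as they appear on the frontier."""
    merged: Dict[str, bool] = {}
    for lab, marked in x + y:
        merged.setdefault(lab, marked)
    return sorted(merged.items(), key=lambda sym: order[sym[0]])


def compute_veins(node, order, parent_vein=None, left=None):
    """Top-down vein rules 1-3; `left` is the node's left sibling, if any."""
    own = [(lab, False) for lab in node.head]
    if parent_vein is None:                              # rule 1: root
        node.vein = own
    elif node.nuclear:                                   # rule 2
        if left is not None and not left.nuclear:
            marked = [(lab, True) for lab in left.head]  # mark(h)
            node.vein = seq(marked, parent_vein, order)
        else:
            node.vein = list(parent_vein)
    elif left is None:                                   # rule 3, left child
        node.vein = seq(own, parent_vein, order)
    else:                                                # rule 3, right child: simpl(v)
        node.vein = seq(own, [s for s in parent_vein if not s[1]], order)
    previous = None
    for child in node.children:
        compute_veins(child, order, node.vein, previous)
        previous = child


def annotate(root: Node) -> None:
    order = {lab: i for i, lab in enumerate(frontier(root))}
    compute_heads(root)
    compute_veins(root, order)


def show(vein: List[Sym]) -> str:
    return " ".join(f"({lab})" if marked else lab for lab, marked in vein)


def accessibility(unit: Node) -> List[str]:
    """Labels of the vein that precede the unit's own label."""
    labels = [lab for lab, _ in unit.vein]
    return labels[: labels.index(unit.label)]


def example():
    # Hypothetical tree: unit 1 is a left satellite of a nucleus that
    # itself groups nucleus 2 with right satellite 3.
    u1, u2, u3 = Node("1", nuclear=False), Node("2"), Node("3", nuclear=False)
    root = Node(children=[u1, Node(children=[u2, u3])])
    annotate(root)
    return u1, u2, u3


if __name__ == "__main__":
    for unit in example():
        print(unit.label, "|", show(unit.vein), "|", accessibility(unit))
```

In this toy tree, unit 1 is reachable from unit 2 (as a marked symbol) but not from unit 3, because simpl() removes marked symbols on the way down to a right satellite.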
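The global smoothness score described in the Introduction (sum of transition scores divided by the number of transitions) can be written down directly. The score table below is the one given in the paper; the function names and the two sample transition sequences are hypothetical and only illustrate how two candidate structures might be compared.

```python
from typing import Dict, Sequence

# Transition scores as listed in the paper's table.
TRANSITION_SCORES = {
    "CONTINUATION": 4,
    "RETAINING": 3,
    "SMOOTH SHIFTING": 2,
    "ABRUPT SHIFTING": 1,
    "NO Cb": 0,
}


def global_smoothness(transitions: Sequence[str]) -> float:
    """Sum of the transition scores divided by the number of transitions."""
    if not transitions:
        # Assumption of this sketch: an empty segment has no defined score.
        raise ValueError("a segment needs at least one transition")
    return sum(TRANSITION_SCORES[t] for t in transitions) / len(transitions)


def smoothest(candidates: Dict[str, Sequence[str]]) -> str:
    """Name of the candidate whose transition sequence scores highest."""
    return max(candidates, key=lambda name: global_smoothness(candidates[name]))


# Invented transition sequences for the same four unit pairs,
# read once in surface order and once in vein order.
SAMPLE = {
    "adjacency": ["CONTINUATION", "NO Cb", "ABRUPT SHIFTING", "RETAINING"],
    "veins": ["CONTINUATION", "RETAINING", "SMOOTH SHIFTING", "RETAINING"],
}

if __name__ == "__main__":
    for name, seq_ in SAMPLE.items():
        print(name, global_smoothness(seq_))
    print("preferred:", smoothest(SAMPLE))
```

With these invented sequences the vein-ordered reading scores higher, which is the direction VT conjectures; the numbers themselves carry no empirical weight.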
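The filter cascade that RefExp applies in Section 3 (morphological agreement first, then an optional semantic check that must keep any center it cannot rule out) can be sketched as below. All class and function names are assumptions of this sketch, the oracle is a stand-in for InfExp rather than its actual interface, and person agreement is always checked here although the paper requires it only "if necessary".

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Center:
    name: str
    gender: str
    number: str
    person: int = 3


@dataclass(frozen=True)
class Reference:
    text: str
    gender: str
    number: str
    person: int = 3


# Hypothetical stand-in for InfExp: True/False when it can decide, None otherwise.
SemanticOracle = Callable[[Reference, Center], Optional[bool]]


def morphological_filter(ref: Reference, centers: List[Center]) -> List[Center]:
    """Keep only the centers that agree in gender, number and person."""
    return [
        c for c in centers
        if (c.gender, c.number, c.person) == (ref.gender, ref.number, ref.person)
    ]


def semantic_filter(ref: Reference, centers: List[Center],
                    oracle: Optional[SemanticOracle] = None) -> List[Center]:
    """Drop only centers the oracle definitely rejects; keep all if it is absent."""
    if oracle is None:
        return list(centers)
    return [c for c in centers if oracle(ref, c) is not False]


def resolve(ref: Reference, centers: List[Center],
            oracle: Optional[SemanticOracle] = None) -> List[Center]:
    return semantic_filter(ref, morphological_filter(ref, centers), oracle)


CLITIC = Reference("-l", "masc", "sg")
CENTERS = [
    Center("Piton", "masc", "sg"),
    Center("Hera", "fem", "sg"),
    Center("Apollo", "masc", "sg"),
]


def agent_differs(ref: Reference, center: Center) -> Optional[bool]:
    # Toy version of the constraint from the example: the agent of
    # "to watch" ([Piton]) cannot also be the one being watched.
    return False if center.name == "Piton" else None


if __name__ == "__main__":
    print([c.name for c in resolve(CLITIC, CENTERS)])
    print([c.name for c in resolve(CLITIC, CENTERS, agent_differs)])
```

Passing no oracle models the case where InfExp is unavailable: the result is less precise (two candidates remain) but processing can still continue, in line with the detachment principle of Section 2.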